Tunix Gemma Reinforcement Learning
1 Abstract
2 Introduction
3 Literature Review
4 Dataset
5 Methods
5.1 Convolutional Feature Extractor
5.2 Training Procedure
| Component | # Params | Trainable at start | Trainable at end |
|---|---|---|---|
| ConvNeXt-B backbone | 88 M | ❌ frozen | ✅ last 3 / 12 |
| Transformer (4 layers) | 17 M | ❌ frozen | ✅ last 1 / 4 |
| Square tokens | 64 × 1024 ≈ 66 k | ✅ | ✅ |
| Linear head | 13 k | ✅ | ✅ |
| Total | 105 M | 79 k (0.07%) | ~30 M (28.6%) |
Fig 4. Summary of model parameter counts, trainability at the start and end of training, and the staged unfreezing schedule.
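The start-of-training share in the table above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the dictionary keys are hypothetical names, and the counts are the rounded figures from the table (88 M, 17 M, 64 × 1024, 13 k), so the result is approximate.

```python
# Approximate parameter counts read off the table (rounded values, assumed exact here).
PARAMS = {
    "convnext_b_backbone": 88_000_000,
    "transformer_4_layers": 17_000_000,
    "square_tokens": 64 * 1024,
    "linear_head": 13_000,
}

# Components the table marks as trainable at the start of training.
TRAINABLE_AT_START = {"square_tokens", "linear_head"}


def trainable_share(params, trainable):
    """Return (trainable count, total count, trainable percentage)."""
    total = sum(params.values())
    active = sum(n for name, n in params.items() if name in trainable)
    return active, total, 100.0 * active / total


active, total, pct = trainable_share(PARAMS, TRAINABLE_AT_START)
print(f"{active:,} of {total:,} parameters trainable ({pct:.2f}%)")
```

The end-of-training share cannot be reproduced the same way, because the table does not give per-layer counts for the unfrozen backbone and transformer layers.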
6 Results
6.1 Error Distribution and Exact-Match Accuracy
| Metric | Baseline ResNeXt (2023) | ConvNeXt-TE (+Tx) (Ours) |
|---|---|---|
| Mean incorrect squares / board | 3.40 | 4.33 |
| Boards with no mistakes (%) | 15.26 | 9.12 |
| Boards with ≤1 mistake (%) | 25.92 | 19.38 |
| Per-square error rate (%) | 5.31 | 5.94 |
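As a rough illustration of how metrics like those in the table could be derived, the sketch below computes them from a list of per-board error counts. The function name, the dictionary keys, and the definition of the per-square rate (mean incorrect squares divided by 64 squares) are assumptions and not taken from the evaluation code. The definition actually used for the reported figures may differ.

```python
def board_metrics(errors_per_board, squares_per_board=64):
    """Summarise per-board error counts (hypothetical helper).

    errors_per_board: number of misclassified squares for each evaluated board.
    """
    n = len(errors_per_board)
    if n == 0:
        raise ValueError("need at least one board")
    mean_errors = sum(errors_per_board) / n
    return {
        "mean_incorrect_squares": mean_errors,
        "pct_no_mistakes": 100.0 * sum(e == 0 for e in errors_per_board) / n,
        "pct_at_most_one_mistake": 100.0 * sum(e <= 1 for e in errors_per_board) / n,
        # Assumed definition: average fraction of the 64 squares that are wrong.
        "per_square_error_pct": 100.0 * mean_errors / squares_per_board,
    }


# Toy input: four boards with 0, 1, 2 and 5 wrong squares (made-up data).
toy = [0, 1, 2, 5]
metrics = board_metrics(toy)
```

The exact-match and at-most-one-mistake percentages depend on the full distribution of errors, so they cannot be recovered from the mean alone.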
6.2 Confusions
6.3 Qualitative
6.4 Training Dynamics
7 Discussion
8 References
Masouris, Athanasios, and Jan C. van Gemert. End-to-End Chess Recognition. Delft: Delft University of Technology, 2023. https://github.com/ThanosM97/end-to-end-chess-recognition.